1. INTRODUCTION

Information technology in the era of the industrial revolution 4.0 is growing rapidly, as can be seen in fields such as expert systems, machine learning, and fuzzy logic. Artificial intelligence applications are widely used in health care, where expert systems in particular can capture the knowledge of a specialist in an application. The shortage of medical personnel, especially heart disease specialists, is a major contributor to deaths from heart disease. This study aims to provide an alternative solution to the limited number of heart disease specialists by using a classification model of heart disorders built with computational methods. The C4.5 algorithm approach to diagnosing heart disorders has two advantages: accuracy, that is, the ability of the model to predict class labels for new data, and speed and efficiency in computation time.

In addition to the C4.5 decision tree approach, many other approaches have been used to detect cardiac disorders. These include expert systems with a certainty factor approach, which encode an expert's knowledge in an application [1]-[3]; diagnosis of heart disease with decision tree algorithms for prediction and classification [4]-[16]; development of an Android-based diagnosis system for heart disorders [17]; a fuzzy expert system for diagnosing heart disease [18]; an authentication method that uses electrocardiogram (ECG) wave features [19]; application of IoT to disease diagnosis [20]-[23]; single-lead ECG classification with deep learning [24]-[26]; a signal analysis system for ECG authentication [27]; improving the diagnosis of heart disease with a particle swarm optimization (PSO) evolutionary algorithm and a neural network [28]; improving heart disease diagnosis with the PSO evolutionary algorithm and a feed-forward neural network [29]; ECG classification using the k-nearest neighbor (KNN) approach [30]; an expert system for automated identification of obstructive sleep apnea from single-lead ECG using random under-sampling boosting [31]; a cardiology expert system based on electrocardiogram data using factors with multiple rules [32]; and a cognitive certainty map (CCM) that uses the certainty factor to assess the causality of cognitive maps for heart failure [33].

Support vector machine (SVM) and AdaBoost methods have also been used, for example for detecting ventricular fibrillation rhythm with a boosted support vector machine and an optimal combination of variables [34]. Other work includes two novel methods for multiclass ECG arrhythmia classification based on principal component analysis (PCA), fuzzy support vector machines, and unbalanced clustering [35]; a multi-class ECG beat classifier based on the truncated Karhunen-Loève transform (KLT) representation [36]; and multi-class ECG beat classification based on a Gaussian mixture model of the Karhunen-Loève transform [37]. Research on ventricular fibrillation (VF) rhythm detection has used a new approach that combines the SVM algorithm, adaptive boosting (AdaBoost), and the differential evolution (DE) algorithm with an optimal combination of variables; the reported advantage of that method is that it uses less memory and can be implemented in real time.

In the health sector, one way to diagnose a person with indications of heart disease is to examine the patient's medical record in the form of electrocardiogram (ECG) data.

Classification with decision trees has been studied by many previous researchers, and comparative studies have been carried out to select a decision tree model, as in [38]. In this study, the decision tree approach is used to determine the outcome for patients diagnosed with heart disease based on electrocardiogram medical record data. The research is expected to contribute to strengthening doctors' confidence when diagnosing the type of heart disorder from electrocardiogram medical records.

This study aims to provide an alternative solution to the limited number of heart disease specialists. The data used are electrocardiogram medical records, and the study builds a classification model of heart disorders using a computational method, namely a decision tree with the C4.5 algorithm. The system was tested with 250 records as training data and 50 records as test data. From the calculations and testing, the mean squared error (MSE) is 0.24, the root mean squared error (RMSE) is 0.49, and the accuracy of the C4.5 algorithm is 75.33%. Because the data are medical electrocardiogram records, a suitable method is needed to classify them accurately and to ensure data validity.


2. METHOD 
2.1. Decision tree C4.5

The decision tree is a classification technique that is widely used in data mining [39]. A decision tree is a flowchart-like graph, shaped like a tree, that represents a decision-making process. It helps a person make a difficult decision by breaking it down into a series of simpler choices. Each decision tree consists of nodes and the branches that connect them.
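As a minimal sketch of this structure, a tree can be represented with decision nodes that test one attribute against a threshold and leaf nodes that hold a class label. The class and function names below are illustrative assumptions, not part of the study. The HR threshold of 120 is taken from the analysis in Section 5, and the leaf labels are the majority classes of the two HR subsets reported there.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # class label returned when this leaf is reached


@dataclass
class Node:
    attribute: str  # attribute tested at this node, e.g. "HR"
    threshold: float  # records with value < threshold follow the left branch
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def classify(tree: Union[Node, Leaf], record: dict) -> str:
    """Follow branches from the root until a leaf is reached."""
    while isinstance(tree, Node):
        if record[tree.attribute] < tree.threshold:
            tree = tree.left
        else:
            tree = tree.right
    return tree.label


# One-level tree using the HR threshold of 120 (illustrative only).
stump = Node("HR", 120.0, left=Leaf("Normal"), right=Leaf("Abnormal"))
print(classify(stump, {"HR": 85}))   # Normal
print(classify(stump, {"HR": 130}))  # Abnormal
```

A full tree simply nests further `Node` objects in place of the leaves.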


3. ECG COMPONENTS 

An electrocardiogram (ECG, also written EKG) is a graph, stored in the medical record, that is generated from the patient's heartbeat. The ECG signal is a time-voltage signal that describes the activity of the heart during successive contractions. Patients who experience symptoms of heart disease such as chest pain, fatigue, weakness, or palpitations undergo an ECG test. The test aims to detect heart abnormalities, such as heart attacks and coronary heart disease, and to evaluate the effectiveness of a pacemaker. ECG examination plays an important role in diagnosing heart disease, but other tests are still needed to confirm the diagnosis.


4. RESEARCH METHOD 
The initial stage of the research is problem analysis. The next step is data collection. The research data are electrocardiogram medical records of 300 patients, which were preprocessed so that they are ready for analysis. Sixteen parameters derived from the electrocardiogram medical record data are used in this study. The 300 records are then divided into two parts: 250 training records and 50 test records. The next step is to build the classification model of cardiac abnormalities using the C4.5 decision tree approach. The final stage is interpreting the classification results.
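The 250/50 division can be sketched as follows. The paper does not state how records were assigned to each part, so the random shuffle, the fixed seed, and the function name `split_records` are assumptions made only for illustration.

```python
import random


def split_records(records, n_train=250, seed=0):
    """Shuffle a copy of the records and split it into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # assumed: random assignment
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder records: integers stand in for the 300 patient rows.
records = list(range(300))
train, test = split_records(records)
print(len(train), len(test))  # 250 50
```

Fixing the seed keeps the split reproducible between runs.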


5. RESULTS AND DISCUSSION 
The dominant (significant) attributes can be determined by calculating the information gain,

$$\text{Information Gain}(S, F_j) = \text{Entropy}(S) - \sum_{v_i \in V_{F_j}} \frac{|S_{v_i}|}{|S|} \cdot \text{Entropy}(S_{v_i})$$

where $V_{F_j}$ is the set of all possible values of attribute $F_j$ and $S_{v_i}$ is the subset of $S$ in which $F_j$ has the value $v_i$. The entropy can be calculated using the Shannon entropy equation,

$$\text{Entropy}(S) = \sum_{i=1}^{c} -p_i \log_2(p_i)$$

where $p_i$ is the proportion of the training sample belonging to class $i$ and $c$ is the number of classes.
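The two equations above can be sketched in Python as follows. The function names and the toy data are illustrative assumptions, not the study's implementation.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(labels, values):
    """Entropy of `labels` minus the weighted entropy after grouping by `values`."""
    total = len(labels)
    subsets = {}
    for label, value in zip(labels, values):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


# Toy data (not from the study): four records, one binary attribute.
labels = ["Normal", "Normal", "Abnormal", "Abnormal"]
values = ["low", "low", "high", "high"]
print(entropy(labels))                   # 1.0
print(information_gain(labels, values))  # 1.0
```

An attribute that separates the classes perfectly, as in the toy data, has a gain equal to the entropy of the whole set; an attribute unrelated to the class has a gain of zero.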

The stages of building the decision tree are as follows.

The first step is to determine the number of positive (Normal) and negative (Abnormal) samples. In this case, there are 139 positive (Normal) samples and 111 negative (Abnormal) samples.

The second step is to calculate the entropy of the training sample S based on its positive and negative decisions.

$$\text{Entropy}(S) = -\frac{139}{250}\log_2\left(\frac{139}{250}\right) - \frac{111}{250}\log_2\left(\frac{111}{250}\right) = 0.99$$
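As a check, the intermediate arithmetic for the entropy of S can be written out with the proportions $139/250 = 0.556$ and $111/250 = 0.444$ (values rounded to three decimals):

$$
\begin{aligned}
\text{Entropy}(S) &= -0.556\,\log_2(0.556) - 0.444\,\log_2(0.444) \\
&\approx 0.556 \times 0.847 + 0.444 \times 1.171 \\
&\approx 0.471 + 0.520 \approx 0.99
\end{aligned}
$$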


The third step is to calculate the entropy of each attribute value with respect to the decision class. The entropy calculated for each attribute value is shown in Table 1.


Table 1. Entropy calculation for each attribute value

Attribute              Attribute value    Entropy
HR                     HR < 120           0.88
                       HR >= 120          0.2
P-R                    P-R < 268.5        0.96
                       P-R >= 268.5       1
QRS                    QRS < 233.5        0.96
                       QRS >= 233.5       0.91
QT                     QT >= 266          0.50
AXIS                   AXIS < 147         0.96
                       AXIS >= 147        0.91
RV6                    RV6 < 2.05         0.95
                       RV6 >= 2.05        0.95
SV1                    SV1 < 0.5          0.97
                       SV1 >= 0.5         0.96
R+S                    R+S < 2.56         0.93
                       R+S >= 2.56        0.95
II-III AVF Wave T      Normal = 169       0.96
                       Abnormal = 81      0.95
T Wave I-AVL Lead      Normal = 194       0.96
                       Abnormal = 56      0.95
T Wave V2-V6 Lead      Normal = 198       0.95
                       Abnormal = 52      0.96
P Wave II_VI Lead      Normal = 164       0.97
                       Abnormal = 85      0.95
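The class counts behind each row of Table 1 come from splitting the training records at a threshold and tallying the decision labels on each side. A minimal sketch of that tallying step is shown below; the function name, the record layout, and the five sample records are hypothetical and stand in for the 250 training rows.

```python
from collections import Counter


def class_counts_by_threshold(records, attribute, threshold, target="label"):
    """Count class labels on each side of a numeric threshold."""
    below = Counter(r[target] for r in records if r[attribute] < threshold)
    at_or_above = Counter(r[target] for r in records if r[attribute] >= threshold)
    return below, at_or_above


# Hypothetical records, for illustration only.
records = [
    {"HR": 72, "label": "Normal"},
    {"HR": 95, "label": "Normal"},
    {"HR": 110, "label": "Abnormal"},
    {"HR": 125, "label": "Abnormal"},
    {"HR": 140, "label": "Abnormal"},
]
below, at_or_above = class_counts_by_threshold(records, "HR", 120)
print(dict(below))        # {'Normal': 2, 'Abnormal': 1}
print(dict(at_or_above))  # {'Abnormal': 2}
```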


a) Calculating the entropy of HR: The HR attribute has two values, < 120 with 198 records and >= 120 with 52 records. The entropy of each attribute value is calculated as follows. HR < 120 = 198 records, consisting of Normal = 137 and Abnormal = 61:

$$\text{Entropy}(\text{HR} < 120) = -\frac{137}{198}\log_2\left(\frac{137}{198}\right) - \frac{61}{198}\log_2\left(\frac{61}{198}\right) = 0.88$$

HR >= 120 = 52 records, consisting of Normal = 2 and Abnormal = 50:

$$\text{Entropy}(\text{HR} \geq 120) = -\frac{2}{52}\log_2\left(\frac{2}{52}\right) - \frac{50}{52}\log_2\left(\frac{50}{52}\right) = 0.2$$

b) Calculating the entropy of P-R: The P-R attribute has two values, < 268.5 with 247 records and >= 268.5 with 4 records. The entropy of each attribute value is calculated as follows. P-R < 268.5 = 247 records, consisting of Normal = 137 and Abnormal = 109:

$$\text{Entropy}(\text{P-R} < 268.5) = -\frac{137}{247}\log_2\left(\frac{137}{247}\right) - \frac{109}{247}\log_2\left(\frac{109}{247}\right) = 0.96$$

P-R >= 268.5 = 4 records, consisting of Normal = 2 and Abnormal = 2:

$$\text{Entropy}(\text{P-R} \geq 268.5) = -\frac{2}{4}\log_2\left(\frac{2}{4}\right) - \frac{2}{4}\log_2\left(\frac{2}{4}\right) = 1$$
c) Calculating the entropy of QRS: The QRS attribute has two values, < 233.5 with 247 records and >= 233.5 with 3 records. The entropy of each attribute value is calculated as follows. QRS < 233.5 = 247 records, consisting of Normal = 137 and Abnormal = 110:

$$\text{Entropy}(\text{QRS} < 233.5) = -\frac{137}{247}\log_2\left(\frac{137}{247}\right) - \frac{110}{247}\log_2\left(\frac{110}{247}\right) = 0.96$$

QRS >= 233.5 = 3 records, consisting of Normal = 2 and Abnormal = 1:

$$\text{Entropy}(\text{QRS} \geq 233.5) = -\frac{2}{3}\log_2\left(\frac{2}{3}\right) - \frac{1}{3}\log_2\left(\frac{1}{3}\right) = 0.91$$


d) Calculating the entropy of QT: The QT attribute has two values, < 266 with 9 records and >= 266 with 247 records. The entropy of each attribute value is calculated as follows. QT >= 266 = 247 records, consisting of Normal = 137 and Abnormal = 104:

$$\text{Entropy}(\text{QT} \geq 266) = -\frac{137}{247}\log_2\left(\frac{137}{247}\right) - \frac{104}{247}\log_2\left(\frac{104}{247}\right) = 0.95$$

QT < 266 = 9 records, consisting of Normal = 2 and Abnormal = 7:

$$\text{Entropy}(\text{QT} < 266) = -\frac{2}{9}\log_2\left(\frac{2}{9}\right) - \frac{7}{9}\log_2\left(\frac{7}{9}\right) = 0.50$$


e) Calculating the entropy of QTC: The QTC attribute has two values, < 425.5 with 106 records and >= 425.5 with 146 records. The entropy of each attribute value is calculated as follows. QTC < 425.5 = 106 records, consisting of Normal = 62 and Abnormal = 43:

$$\text{Entropy}(\text{QTC} < 425.5) = -\frac{62}{106}\log_2\left(\frac{62}{106}\right) - \frac{43}{106}\log_2\left(\frac{43}{106}\right) = 0.96$$

QTC >= 425.5 = 146 records, consisting of Normal = 78 and Abnormal = 68:

$$\text{Entropy}(\text{QTC} \geq 425.5) = -\frac{78}{146}\log_2\left(\frac{78}{146}\right) - \frac{68}{146}\log_2\left(\frac{68}{146}\right) = 0.97$$


f) Calculating the entropy of AXIS: The AXIS attribute has two values, < 147 with 247 records and >= 147 with 3 records. The entropy of each attribute value is calculated as follows. AXIS < 147 = 247 records, consisting of Normal = 138 and Abnormal = 109:

$$\text{Entropy}(\text{AXIS} < 147) = -\frac{138}{247}\log_2\left(\frac{138}{247}\right) - \frac{109}{247}\log_2\left(\frac{109}{247}\right) = 0.96$$

AXIS >= 147 = 3 records, consisting of Normal = 1 and Abnormal = 2:

$$\text{Entropy}(\text{AXIS} \geq 147) = -\frac{1}{3}\log_2\left(\frac{1}{3}\right) - \frac{2}{3}\log_2\left(\frac{2}{3}\right) = 0.91$$


g) Calculating the entropy of RV6: The RV6 attribute has two values, < 2.05 with 213 records and >= 2.05 with 37 records. The entropy of each attribute value is calculated as follows. RV6 < 2.05 = 213 records, consisting of Normal = 124 and Abnormal = 89:

$$\text{Entropy}(\text{RV6} < 2.05) = -\frac{124}{213}\log_2\left(\frac{124}{213}\right) - \frac{89}{213}\log_2\left(\frac{89}{213}\right) = 0.95$$

RV6 >= 2.05 = 37 records, consisting of Normal = 15 and Abnormal = 22:

$$\text{Entropy}(\text{RV6} \geq 2.05) = -\frac{15}{37}\log_2\left(\frac{15}{37}\right) - \frac{22}{37}\log_2\left(\frac{22}{37}\right) = 0.95$$


h) Calculating the entropy of SV1: The SV1 attribute has two values, < 0.5 with 92 records and >= 0.5 with 158 records. The entropy of each attribute value is calculated as follows. SV1 < 0.5 = 92 records, consisting of Normal = 48 and Abnormal = 44:

$$\text{Entropy}(\text{SV1} < 0.5) = -\frac{48}{92}\log_2\left(\frac{48}{92}\right) - \frac{44}{92}\log_2\left(\frac{44}{92}\right) = 0.97$$

SV1 >= 0.5 = 158 records, consisting of Normal = 92 and Abnormal = 66:

$$\text{Entropy}(\text{SV1} \geq 0.5) = -\frac{92}{158}\log_2\left(\frac{92}{158}\right) - \frac{66}{158}\log_2\left(\frac{66}{158}\right) = 0.96$$

i) Calculating the entropy of R+S: The R+S attribute has two values, < 2.56 with 188 records and >= 2.56 with 62 records. The entropy of each attribute value is calculated as follows. R+S < 2.56 = 188 records, consisting of Normal = 113 and Abnormal = 75:

$$\text{Entropy}(\text{R+S} < 2.56) = -\frac{113}{188}\log_2\left(\frac{113}{188}\right) - \frac{75}{188}\log_2\left(\frac{75}{188}\right) = 0.93$$

R+S >= 2.56 = 62 records, consisting of Normal = 36 and Abnormal = 26:

$$\text{Entropy}(\text{R+S} \geq 2.56) = -\frac{36}{62}\log_2\left(\frac{36}{62}\right) - \frac{26}{62}\log_2\left(\frac{26}{62}\right) = 0.95$$


j) Calculating the entropy of the Lead II_VI P wave: The Lead II_VI P wave attribute has two values, Normal with 164 records and Abnormal with 85 records. The entropy of each attribute value is calculated as follows. Normal = 164 records, consisting of normal to normal = 90 and normal to abnormal = 74:

$$\text{Entropy}(\text{Normal}) = -\frac{90}{164}\log_2\left(\frac{90}{164}\right) - \frac{74}{164}\log_2\left(\frac{74}{164}\right) = 0.97$$

Abnormal = 85 records, consisting of abnormal to normal = 50 and abnormal to abnormal = 35:

$$\text{Entropy}(\text{Abnormal}) = -\frac{50}{85}\log_2\left(\frac{50}{85}\right) - \frac{35}{85}\log_2\left(\frac{35}{85}\right) = 0.95$$


k) Calculating the entropy of the Lead II_III_AVF T wave: The Lead II_III_AVF T wave attribute has two values, Normal with 169 records and Abnormal with 81 records. The entropy of each attribute value is calculated as follows. Normal = 169 records, consisting of normal to normal = 89 and normal to abnormal = 80:

$$\text{Entropy}(\text{Normal}) = -\frac{89}{169}\log_2\left(\frac{89}{169}\right) - \frac{80}{169}\log_2\left(\frac{80}{169}\right) = 0.96$$

Abnormal = 81 records, consisting of abnormal to normal = 50 and abnormal to abnormal = 31:

$$\text{Entropy}(\text{Abnormal}) = -\frac{50}{81}\log_2\left(\frac{50}{81}\right) - \frac{31}{81}\log_2\left(\frac{31}{81}\right) = 0.95$$


l) Calculating the entropy of the Lead I_AVL T wave: The Lead I_AVL T wave attribute has two values, Normal with 194 records and Abnormal with 56 records. The entropy of each attribute value is calculated as follows. Normal = 194 records, consisting of normal to normal = 116 and normal to abnormal = 78:

$$\text{Entropy}(\text{Normal}) = -\frac{116}{194}\log_2\left(\frac{116}{194}\right) - \frac{78}{194}\log_2\left(\frac{78}{194}\right) = 0.96$$

Abnormal = 56 records, consisting of abnormal to normal = 23 and abnormal to abnormal = 33:

$$\text{Entropy}(\text{Abnormal}) = -\frac{23}{56}\log_2\left(\frac{23}{56}\right) - \frac{33}{56}\log_2\left(\frac{33}{56}\right) = 0.95$$


m) Calculating the entropy of the Lead V2-V6 T wave: The Lead V2-V6 T wave attribute has two values, Normal with 198 records and Abnormal with 52 records. The entropy of each attribute value is calculated as follows. Normal = 198 records, consisting of normal to normal = 118 and normal to abnormal = 80:

$$\text{Entropy}(\text{Normal}) = -\frac{118}{198}\log_2\left(\frac{118}{198}\right) - \frac{80}{198}\log_2\left(\frac{80}{198}\right) = 0.95$$

Abnormal = 52 records, consisting of abnormal to normal = 20 and abnormal to abnormal = 32:

$$\text{Entropy}(\text{Abnormal}) = -\frac{20}{52}\log_2\left(\frac{20}{52}\right) - \frac{32}{52}\log_2\left(\frac{32}{52}\right) = 0.96$$


n) Calculating the entropy of the Lead II_III_AVF ST segment: The Lead II_III_AVF ST segment attribute has two values, Normal with 248 records and Abnormal with 2 records. The entropy of each attribute value is calculated as follows. Normal = 248 records, consisting of normal to normal = 138 and normal to abnormal = 110:

$$\text{Entropy}(\text{Normal}) = -\frac{138}{248}\log_2\left(\frac{138}{248}\right) - \frac{110}{248}\log_2\left(\frac{110}{248}\right) = 0.96$$

Abnormal = 2 records, consisting of abnormal to normal = 1 and abnormal to abnormal = 1:

$$\text{Entropy}(\text{Abnormal}) = -\frac{1}{2}\log_2\left(\frac{1}{2}\right) - \frac{1}{2}\log_2\left(\frac{1}{2}\right) = 1$$


o) Calculating the entropy of the Lead I_AVL ST segment: The Lead I_AVL ST segment attribute has two values, Normal with 248 records and Abnormal with 2 records. The entropy of each attribute value is calculated as follows. Normal = 248 records, consisting of normal to normal = 138 and normal to abnormal = 110:

$$\text{Entropy}(\text{Normal}) = -\frac{138}{248}\log_2\left(\frac{138}{248}\right) - \frac{110}{248}\log_2\left(\frac{110}{248}\right) = 0.96$$

Abnormal = 2 records, consisting of abnormal to normal = 1 and abnormal to abnormal = 1:

$$\text{Entropy}(\text{Abnormal}) = -\frac{1}{2}\log_2\left(\frac{1}{2}\right) - \frac{1}{2}\log_2\left(\frac{1}{2}\right) = 1$$


p) Calculating the entropy of the V2-V6 ST segment: The V2-V6 ST segment attribute has two values, Normal with 241 records and Abnormal with 9 records. The entropy of each attribute value is calculated as follows. Normal = 241 records, consisting of normal to normal = 134 and normal to abnormal = 107:

$$\text{Entropy}(\text{Normal}) = -\frac{134}{241}\log_2\left(\frac{134}{241}\right) - \frac{107}{241}\log_2\left(\frac{107}{241}\right) = 0.96$$

Abnormal = 9 records, consisting of abnormal to normal = 5 and abnormal to abnormal = 4:

$$\text{Entropy}(\text{Abnormal}) = -\frac{5}{9}\log_2\left(\frac{5}{9}\right) - \frac{4}{9}\log_2\left(\frac{4}{9}\right) = 0.96$$
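The per-value entropies in items a) to p) all follow the same two-class pattern, so they can be recomputed directly from the (Normal, Abnormal) counts. The sketch below does this for the HR counts and for the P-R >= 268.5 subset; the function name is an assumption. Because the code works at full floating-point precision, its results may differ in the second decimal place from the rounded figures printed in the text.

```python
from math import log2


def entropy_from_counts(*counts):
    """Shannon entropy (bits) from class counts; zero counts contribute nothing."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


# (Normal, Abnormal) counts as reported in the text.
hr_below = entropy_from_counts(137, 61)    # HR < 120
hr_at_or_above = entropy_from_counts(2, 50)  # HR >= 120
pr_at_or_above = entropy_from_counts(2, 2)   # P-R >= 268.5

print(round(hr_below, 3), round(hr_at_or_above, 3), pr_at_or_above)
```

A subset split evenly between the two classes, such as (2, 2), has the maximum entropy of 1 bit, while a nearly pure subset such as (2, 50) has entropy close to 0.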


The fourth step is to calculate the information gain of each attribute to determine which attribute is the most dominant and will serve as the root of the tree. The information gain of each attribute is calculated with

$$\text{Information Gain}(S, F_j) = \text{Entropy}(S) - \sum_{v_i \in V_{F_j}} \frac{|S_{v_i}|}{|S|} \times \text{Entropy}(S_{v_i})$$

which gives the following results.


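HR attribute:

$$\text{Information Gain}(S, \text{HR}) = 0.97 - \left\{\frac{198}{250} \times 0.88 + \frac{52}{250} \times 0.2\right\} = 0.24$$

P-R attribute:

$$\text{Information Gain}(S, \text{P-R}) = 0.97 - \left\{\frac{247}{250} \times 0.96 + \frac{4}{250} \times 1\right\} = 0.01$$

QRS attribute:

$$\text{Information Gain}(S, \text{QRS}) = 0.97 - \left\{\frac{247}{250} \times 0.96 + \frac{3}{250} \times 0.91\right\} = 0.02$$

QT attribute:

$$\text{Information Gain}(S, \text{QT}) = 0.97 - \left\{\frac{247}{250} \times 0.95 + \frac{9}{250} \times 0.50\right\} = 0.01$$

QTC attribute:

$$\text{Information Gain}(S, \text{QTC}) = 0.97 - \left\{\frac{106}{250} \times 0.96 + \frac{146}{250} \times 0.97\right\} = 0.01$$

AXIS attribute:

$$\text{Information Gain}(S, \text{AXIS}) = 0.97 - \left\{\frac{247}{250} \times 0.96 + \frac{3}{250} \times 0.91\right\} = 0.01$$

RV6 attribute:

$$\text{Information Gain}(S, \text{RV6}) = 0.97 - \left\{\frac{213}{250} \times 0.95 + \frac{37}{250} \times 0.95\right\} = 0.02$$

SV1 attribute:

$$\text{Information Gain}(S, \text{SV1}) = 0.97 - \left\{\frac{92}{250} \times 0.97 + \frac{158}{250} \times 0.96\right\} = 0.01$$

R+S attribute:

$$\text{Information Gain}(S, \text{R+S}) = 0.97 - \left\{\frac{188}{250} \times 0.93 + \frac{62}{250} \times 0.95\right\} = 0.04$$

P Wave II_VI Lead attribute:

$$\text{Information Gain}(S, \text{P Wave II\_VI Lead}) = 0.97 - \left\{\frac{164}{250} \times 0.97 + \frac{85}{250} \times 0.95\right\} = 0.01$$

T Wave II_III AVF Lead attribute:

$$\text{Information Gain}(S, \text{T Wave II\_III AVF Lead}) = 0.97 - \left\{\frac{169}{250} \times 0.96 + \frac{81}{250} \times 0.95\right\} = 0.01$$

T Wave I-AVL Lead attribute:

$$\text{Information Gain}(S, \text{T Wave I-AVL Lead}) = 0.97 - \left\{\frac{194}{250} \times 0.96 + \frac{56}{250} \times 0.95\right\} = 0.02$$

T Wave V2-V6 Lead attribute:

$$\text{Information Gain}(S, \text{T Wave V2-V6 Lead}) = 0.97 - \left\{\frac{198}{250} \times 0.95 + \frac{52}{250} \times 0.96\right\} = 0.03$$

Lead II-III AVF Segment ST attribute:

$$\text{Information Gain}(S, \text{Lead II-III AVF Segment ST}) = 0.97 - \left\{\frac{248}{250} \times 0.96 + \frac{2}{250} \times 1\right\} = 0.01$$

Lead I-AVL Segment ST attribute:

$$\text{Information Gain}(S, \text{Lead I-AVL Segment ST}) = 0.97 - \left\{\frac{248}{250} \times 0.96 + \frac{2}{250} \times 1\right\} = 0.012$$

V2-V6 Segment ST attribute:

$$\text{Information Gain}(S, \text{V2-V6 Segment ST}) = 0.97 - \left\{\frac{241}{250} \times 0.96 + \frac{9}{250} \times 0.96\right\} = 0.02$$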
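As a cross-check of the HR result, the information gain can be computed directly from the class counts reported above rather than from the rounded entropies. This is a minimal sketch; the function names are assumptions, and the counts are those given in the text: S = (139 Normal, 111 Abnormal), HR < 120 = (137, 61), HR >= 120 = (2, 50).

```python
from math import log2


def entropy_from_counts(*counts):
    """Shannon entropy (bits) from class counts; zero counts contribute nothing."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def gain_from_counts(parent, subsets):
    """Information gain from parent class counts and per-subset class counts."""
    total = sum(parent)
    remainder = sum(sum(s) / total * entropy_from_counts(*s) for s in subsets)
    return entropy_from_counts(*parent) - remainder


hr_gain = gain_from_counts((139, 111), [(137, 61), (2, 50)])
print(round(hr_gain, 2))  # 0.24
```

Working from counts avoids accumulating rounding error from the intermediate two-decimal entropy values.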


The fifth step is to derive the subsets for the root of the tree from the values of the attribute selected as the root. The attribute with the highest information gain is selected as the root. The information gain of each attribute from the first calculation is shown in Table 2.


Table 2. Information gain of each attribute

Attribute                      Information gain
HR                             0.24
P-R                            0.01
QRS                            0.02
QT                             0.01
QTC                            0.01
AXIS                           0.01
RV6                            0.02
SV1                            0.01
R+S                            0.01
P Wave II_VI Lead              0.02
II-III AVF Wave T              0.01
T Wave I-AVL Lead              0.04
T Wave V2-V6 Leads             0.01
Lead II-III AVF Segment ST     0.01
V2-V6 Segment ST               0.02

Based on the information gain calculation, the HR attribute has the highest gain, so HR is used as the root. The decision nodes below it are determined by the values of HR. The initial shape of the decision tree based on the information gain is shown in Figure 1.
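Selecting the root amounts to taking the attribute with the largest information gain. A minimal sketch, using a subset of the values listed in Table 2 (the dictionary name is an assumption):

```python
# Information gain values as listed in Table 2 (subset shown for brevity).
gains = {
    "HR": 0.24,
    "P-R": 0.01,
    "QRS": 0.02,
    "QT": 0.01,
    "QTC": 0.01,
    "AXIS": 0.01,
    "RV6": 0.02,
    "SV1": 0.01,
}

# The attribute with the highest gain becomes the root of the tree.
root = max(gains, key=gains.get)
print(root)  # HR
```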

The calculation is continued until a complete decision tree is formed, so that it represents the classification pattern of the heart disease data according to the available ECG data. The complete decision tree is shown in Figure 2.
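The repeated calculation can be sketched as a recursive procedure: choose the attribute with the highest information gain, split the records on its values, and repeat on each subset until a subset is pure or no attributes remain. The sketch below assumes attributes that have already been discretized into values such as "<120" and ">=120", and uses hypothetical records rather than the study's data. It is a simplified illustration; the full C4.5 algorithm also includes gain ratio and pruning, which are omitted here.

```python
from collections import Counter
from math import log2


def entropy(rows, target):
    total = len(rows)
    counts = Counter(r[target] for r in rows)
    return -sum(n / total * log2(n / total) for n in counts.values())


def partition(rows, attribute):
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r)
    return groups


def gain(rows, attribute, target):
    groups = partition(rows, attribute).values()
    remainder = sum(len(g) / len(rows) * entropy(g, target) for g in groups)
    return entropy(rows, target) - remainder


def build_tree(rows, attributes, target="label"):
    """Return a nested dict; leaves are class labels (majority class)."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attributes:
        return majority
    best = max(attributes, key=lambda a: gain(rows, a, target))
    if gain(rows, best, target) <= 0:
        return majority  # no attribute improves purity
    remaining = [a for a in attributes if a != best]
    return {best: {value: build_tree(subset, remaining, target)
                   for value, subset in partition(rows, best).items()}}


# Hypothetical, already discretized records (not the study's data).
rows = [
    {"HR": "<120", "QT": ">=266", "label": "Normal"},
    {"HR": "<120", "QT": ">=266", "label": "Normal"},
    {"HR": "<120", "QT": "<266", "label": "Abnormal"},
    {"HR": ">=120", "QT": ">=266", "label": "Abnormal"},
    {"HR": ">=120", "QT": ">=266", "label": "Abnormal"},
    {"HR": ">=120", "QT": "<266", "label": "Abnormal"},
]
tree = build_tree(rows, ["HR", "QT"])
print(tree)
```

On these sample records HR has the higher gain and becomes the root, and the HR "<120" branch is then split again on QT.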


Figure 1. Initial decision tree

Figure 2. Overall decision tree


5.1. Results evaluation 

The C4.5 data mining algorithm is used for classification and prediction. A C4.5 decision tree is easy to interpret and fast to compute compared with other algorithms. The results were evaluated using the Python language and are shown in Table 3.


Table 3. Result evaluation

Mean squared error (MSE)    Root mean squared error (RMSE)    Success rate
0.24                        0.49                              75.33%
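The three measures in Table 3 can be computed as in the sketch below. The paper does not show its evaluation code, so the function name, the 0/1 label encoding, and the sample predictions are assumptions. With labels encoded as 0 and 1, the MSE equals the misclassification rate, which is consistent with the reported values (1 - 0.7533 is approximately 0.25).

```python
from math import sqrt


def evaluate(y_true, y_pred):
    """Return (MSE, RMSE, accuracy) for labels encoded as 0 or 1."""
    n = len(y_true)
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    return mse, sqrt(mse), accuracy


# Hypothetical labels (0 = Normal, 1 = Abnormal), not the study's predictions.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1]
mse, rmse, acc = evaluate(y_true, y_pred)
print(mse, rmse, acc)  # 0.25 0.5 0.75
```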

6. CONCLUSION

The classification rules formed from the training data are influenced by the distribution of the data on cardiac abnormalities. The wider the data distribution, the better the extracted classification rules; conversely, with a narrower data distribution, the classification rules are not formed well. In this study, the data were tested using a computational method, the C4.5 decision tree approach. System validity was tested with 250 records as training data and 50 records as test data. From the calculations and testing, the MSE is 0.24, the RMSE is 0.49, and the accuracy of the C4.5 algorithm is 75.33%.